compaq_armada_small_bw.gif (5457 bytes) Alf P. Steinbach´s personal homepage: Programming: Narrow Topics

Constant time insertion in random access lists

Contents:

Cursor gap arrays.
: Limitations of cursor gap arrays.
: Implementation considerations.

Random access with regard to an operation Op means that any element number k in a data structure, where k is in the range 1...n, n being the current number of elements, can be found in constant time, or, for files, effectively constant time, after which Op can be performed in time independent of k and n. Computer programming introductory books give the impression that if you want random access read/update you´ll have to use an array, and forsake constant time general insertion and deletion; conversely, if you want constant time general insertion and deletion you´ll have to use a linked data structure (e.g. a linked list) and forsake random access read/update. This impression is so strong that many programmers think it´s impossible to combine random access read/update and constant time general insertion and deletion.

The reason for the misunderstanding is probably that authors and lecturers find that it provides an easy-to-understand reason for using linked lists instead of simpler arrays, or vice versa, but then it´s a misguided oversimplification.

One data structure allowing the combination is called the cursor gap array. It was used extensively in early text editors, hence the name. The insert/delete position corresponded directly to the on-screen text cursor position.

Last updated Wednesday, February 03, 1999

Constant time insertion in random access lists

Cursor gap arrays.

A cursor gap array is an array where free positions are coalesced into a contiguous area called the cursor gap, corresponding to a sequentially movable insert/delete position called the cursor. As the cursor is moved, element values are copied from one end of the cursor gap to the other, keeping the cursor gap synchronized with the cursor. The copying overhead need not be excessive: if the element values are costly to copy pointers can be stored instead of the actual values, and when the element values are simple numbers or characters the copying overhead is negligible anyway.

cursor_gap_array.gif (7200 bytes)

The application´s view is a virtual resizable array implemented in terms of the cursor gap array, with its maximum size set by the cursor gap array. As elements are inserted the virtual array grows. As elements are deleted, the virtual array shrinks.

Since the virtual array can shrink to zero size it´s important that the cursor position is conceptually between elements, not at a particular element.

Limitations of cursor gap arrays.

In contrast to a linked list, a cursor gap array can at any time have only one constant time insert/delete position. However, operating with more than one insert/delete position in a linked list is usually to ask for trouble, since the insert/delete positions may interact with each other and with other references into the list. Thus, the limitation of one insert/delete position is mainly academic.

Also in contrast to a linked list, a cursor gap array has a maximum size established at construction. This limitation can be overcome by implementing a reallocation scheme (e.g. by using the operating system´s services for reallocation), at the cost of somewhat less efficient operation.

Finally, a cursor gap array does not allow external pointers to its elements, since element values may be moved about in the real array. In a language like C++ this limitation can be overcome by defining a dedicated "smart-pointer" class, C++ iterator. Whether it would be worth it depends on whether one would just use the class as syntactic sugaring or use it as an interface to the C++ standard library´s collection of iterator-based algorithms etc.

Implementation considerations.

Moving the cursor a long distance, e.g. to the start or end of the array, can be optimized by block-move instructions, by using DMA (Direct Memory Access) or by utilizing a graphics accelerator (hardware bitblt, blitter). Thus a wrapper class for a cursor gap array should provide operations to move the cursor arbitrarily within the array, in addition to sequential access, even though an arbitrary positioning cannot be done in constant time. For small arrays, say, less than some hundred KB, a properly optimized arbitrary cursor move can still be regarded as "effectively" constant time.

Copying a number of elements from one position to another can be similarly optimized.

Finally, moving the cursor from the start to the end, or vice versa, can be be implemented as a constant time operation by treating the real array as a circular array. If this is done, arbitrary cursor moves can be further optimized by always moving in the (circularly) shortest direction. However, these optimizations are on the point of diminishing returns.